NSF PAR Search | NSF Public Access Repository


Efficient and Private Marginal Reconstruction with Local Non-Negativity

Mullins, Brett; Fuentes, Miguel; Xiao, Yingtai; Kifer, Daniel; Musco, Cameron; Sheldon, Daniel (December 2025, 38th Conference on Neural Information Processing Systems (NeurIPS 2024).)

Differential privacy is the dominant standard for formal and quantifiable privacy and has been used in major deployments that impact millions of people. Many differentially private algorithms for query release and synthetic data contain steps that reconstruct answers to queries from answers to other queries that have been measured privately. Reconstruction is an important subproblem for such mechanisms to economize the privacy budget, minimize error on reconstructed answers, and allow for scalability to high-dimensional datasets. In this paper, we introduce a principled and efficient postprocessing method ReM (Residuals-to-Marginals) for reconstructing answers to marginal queries. Our method builds on recent work on efficient mechanisms for marginal query release, based on making measurements using a residual query basis that admits efficient pseudoinversion, which is an important primitive used in reconstruction. An extension GReM-LNN (Gaussian Residuals-to-Marginals with Local Non-negativity) reconstructs marginals under Gaussian noise satisfying consistency and non-negativity, which often reduces error on reconstructed answers. We demonstrate the utility of ReM and GReM-LNN by applying them to improve existing private query answering mechanisms.
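As a rough illustration of the reconstruction step described in this abstract, the sketch below estimates marginal answers from noisy measurements using a pseudoinverse. It is a generic least-squares reconstruction, not the ReM or GReM-LNN algorithm; the function name `reconstruct_answers`, the toy histogram, and the noise scale are all assumptions made for the example.

```python
import numpy as np


def reconstruct_answers(measured_queries, noisy_answers, target_queries):
    """Answer target queries from noisy answers to other linear queries.

    Generic least-squares sketch (not ReM): estimate the data vector with
    the pseudoinverse of the measured query matrix, then apply the targets.
    """
    data_estimate = np.linalg.pinv(measured_queries) @ noisy_answers
    return target_queries @ data_estimate


# Hypothetical 2x2 contingency table over attributes (A, B), flattened
# in the order (a0,b0), (a0,b1), (a1,b0), (a1,b1).
histogram = np.array([10.0, 5.0, 3.0, 2.0])

# Measured queries: every cell of the two-way marginal (identity matrix).
measured_queries = np.eye(4)

# Target queries: the one-way marginal on attribute A.
marginal_on_a = np.array([[1.0, 1.0, 0.0, 0.0],
                          [0.0, 0.0, 1.0, 1.0]])

# Gaussian noise stands in for a private measurement; sigma is arbitrary.
rng = np.random.default_rng(0)
sigma = 1.0
noisy_answers = measured_queries @ histogram + rng.normal(0.0, sigma, size=4)

estimate = reconstruct_answers(measured_queries, noisy_answers, marginal_on_a)
print(estimate)  # noisy estimate of the true marginal [15, 5]
```

Nothing in this sketch enforces non-negativity: an estimate can come out negative when counts are small relative to the noise, which is the kind of issue the abstract says GReM-LNN addresses.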
Free, publicly-accessible full text available December 15, 2026
Efficient and Private Marginal Reconstruction with Local Non-Negativity

Mullins, Brett; Fuentes, Miguel; Xiao, Yingtai; Kifer, Daniel; Musco, Cameron; Sheldon, Daniel (December 2024, Advances in Neural Information Processing Systems (NeurIPS))

Full Text Available
Efficient and Private Marginal Reconstruction with Local Non-Negativity

Mullins, Brett; Fuentes, Miguel; Xiao, Yingtai; Kifer, Daniel; Musco, Cameron; Sheldon, Daniel (December 2024, Conference on Neural Information Processing Systems (NeurIPS) 2024)

Full Text Available
Efficient and Private Marginal Reconstruction with Local Non-Negativity

Mullins, Brett; Fuentes, Miguel; Xiao, Yingtai; Kifer, Daniel; Musco, Cameron N; Sheldon, Daniel (December 2024, NeurIPS)

Full Text Available
An Optimal and Scalable Matrix Mechanism for Noisy Marginals under Convex Loss Functions

Xiao, Yingtai; He, Guanlin; Zhang, Danfeng; Kifer, Daniel (December 2023, NeurIPS)

Full Text Available
Answering Private Linear Queries Adaptively Using the Common Mechanism

https://doi.org/10.14778/3594512.3594519

Xiao, Yingtai; Wang, Guanhong; Zhang, Danfeng; Kifer, Daniel (April 2023, Proceedings of the VLDB Endowment)

When analyzing confidential data through a privacy filter, a data scientist often needs to decide which queries will best support their intended analysis. For example, an analyst may wish to study noisy two-way marginals in a dataset produced by a mechanism M₁. But, if the data are relatively sparse, the analyst may choose to examine noisy one-way marginals, produced by a mechanism M₂, instead. Since the choice of whether to use M₁ or M₂ is data-dependent, a typical differentially private workflow is to first split the privacy loss budget ρ into two parts, ρ₁ and ρ₂, then use the first part ρ₁ to determine which mechanism to use, and the remainder ρ₂ to obtain noisy answers from the chosen mechanism. In a sense, the first step seems wasteful because it takes away part of the privacy loss budget that could have been used to make the query answers more accurate. In this paper, we consider the question of whether the choice between M₁ and M₂ can be performed without wasting any privacy loss budget. For linear queries, we propose a method for decomposing M₁ and M₂ into three parts: (1) a mechanism M* that captures their shared information, (2) a mechanism M′₁ that captures information that is specific to M₁, and (3) a mechanism M′₂ that captures information that is specific to M₂. Running M* and M′₁ together is completely equivalent to running M₁ (both in terms of query answer accuracy and total privacy cost ρ). Similarly, running M* and M′₂ together is completely equivalent to running M₂. Since M* will be used no matter what, the analyst can use its output to decide whether to subsequently run M′₁ (thus recreating the analysis supported by M₁) or M′₂ (recreating the analysis supported by M₂), without wasting privacy loss budget.
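To make the cost of the conventional budget split concrete, the sketch below compares Gaussian noise scales with and without the split. It assumes ρ is a zero-concentrated differential privacy (zCDP) budget, which the abstract does not state explicitly, and uses the standard fact that Gaussian noise with standard deviation σ on a query of L2 sensitivity Δ costs Δ²/(2σ²). The function name and the numbers are hypothetical.

```python
import math


def gaussian_sigma(l2_sensitivity, rho):
    """Noise standard deviation for the Gaussian mechanism under rho-zCDP."""
    if rho <= 0:
        raise ValueError("rho must be positive")
    return l2_sensitivity / math.sqrt(2.0 * rho)


total_rho = 0.5          # hypothetical total privacy loss budget
rho_select = 0.1         # rho_1: spent on choosing between M1 and M2
rho_answer = total_rho - rho_select  # rho_2: left for the chosen mechanism

sigma_with_split = gaussian_sigma(1.0, rho_answer)
sigma_without_split = gaussian_sigma(1.0, total_rho)

print(f"sigma with split:    {sigma_with_split:.3f}")
print(f"sigma without split: {sigma_without_split:.3f}")
```

Any budget spent on selection raises the noise on the final answers; this is the overhead that the decomposition in the abstract is meant to avoid.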
Full Text Available
Free gap estimates from the exponential mechanism, sparse vector, noisy max and related algorithms

https://doi.org/10.1007/s00778-022-00728-2

Ding, Zeyu; Wang, Yuxin; Xiao, Yingtai; Wang, Guanhong; Zhang, Danfeng; Kifer, Daniel (February 2022, The VLDB Journal)

Full Text Available
DPGen: Automated Program Synthesis for Differential Privacy

https://doi.org/10.1145/3460120.3484781

Wang, Yuxin; Ding, Zeyu; Xiao, Yingtai; Kifer, Daniel; Zhang, Danfeng (November 2021, Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security)

Full Text Available
Optimizing fitness-for-use of differentially private linear queries

https://doi.org/10.14778/3467861.3467864

Xiao, Yingtai; Ding, Zeyu; Wang, Yuxin; Zhang, Danfeng; Kifer, Daniel (June 2021, Proceedings of the VLDB Endowment)
In practice, differentially private data releases are designed to support a variety of applications. A data release is fit for use if it meets target accuracy requirements for each application. In this paper, we consider the problem of answering linear queries under differential privacy subject to per-query accuracy constraints. Existing practical frameworks like the matrix mechanism do not provide such fine-grained control (they optimize total error, which allows some query answers to be more accurate than necessary, at the expense of other queries that become no longer useful). Thus, we design a fitness-for-use strategy that adds privacy-preserving Gaussian noise to query answers. The covariance structure of the noise is optimized to meet the fine-grained accuracy requirements while minimizing the cost to privacy.
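The trade-off described here can be sketched for a linear Gaussian mechanism that releases Bx + N(0, Σ). The sketch assumes a zCDP accounting in which neighboring datasets differ by one record, so that x changes by one in a single coordinate and the cost is half the largest value of bᵀΣ⁻¹b over the columns b of B. It only evaluates a given Σ against variance targets and does not perform the optimization from the paper. All names and numbers are hypothetical.

```python
import numpy as np


def per_query_variances(covariance):
    """Variance of each noisy query answer: the diagonal of the covariance."""
    return np.diag(covariance).copy()


def zcdp_cost(query_matrix, covariance):
    """zCDP cost of releasing query_matrix @ x + N(0, covariance).

    Assumes neighbors change x by 1 in one coordinate, so the mean shifts
    by one column b of query_matrix and the cost is max_b b^T Sigma^-1 b / 2.
    """
    solved = np.linalg.solve(covariance, query_matrix)  # Sigma^-1 B
    column_costs = np.sum(query_matrix * solved, axis=0)
    return float(column_costs.max()) / 2.0


def meets_targets(covariance, variance_targets):
    """Check the per-query accuracy constraints (variance upper bounds)."""
    return bool(np.all(per_query_variances(covariance) <= variance_targets))


# Hypothetical example: two counting queries with different accuracy needs.
query_matrix = np.eye(2)
covariance = np.diag([1.0, 4.0])   # second query tolerates more noise
variance_targets = np.array([1.0, 4.0])

print(meets_targets(covariance, variance_targets))
print(zcdp_cost(query_matrix, covariance))
```

A fitness-for-use strategy in the sense of the abstract would search over Σ for the lowest such cost among covariances that satisfy every target, rather than checking a single candidate.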
Full Text Available
